Is it possible to extract hyperlinks from Excel files in R? I looked through XLConnect and xlsx, but the only thing I've found is how to write hyperlinks, not read them.
I found a super convoluted way to extract the hyperlinks:
library(XML)
# copy the file to a .zip name (the anchored pattern changes only the extension)
my.zip.file <- sub("\\.xlsx$", ".zip", my.excel.file)
file.copy(from = my.excel.file, to = my.zip.file)
# unzip the file
unzip(my.zip.file)
# unzipping produces a bunch of files which we can read using the XML package
# assume sheet1 has our data
xml <- xmlParse("xl/worksheets/sheet1.xml")
# finally grab the hyperlinks; the prefix "x" stands in for the sheet's default namespace
hyperlinks <- xpathApply(xml, "//x:hyperlink/@display", namespaces = "x")
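For repeated use, the same steps could be wrapped in a function. This is only a sketch: `extract_hyperlinks` and `read_sheet_hyperlinks` are hypothetical names, the sheet path defaults to sheet1 as assumed above, and the namespace URI is spelled out on the assumption that the file is a standard .xlsx. It unzips into a temporary directory rather than the working directory and returns the cell reference next to the display text.

```r
library(XML)

# Hypothetical helper: pull the cell reference and display text of every
# <hyperlink> element out of an already parsed worksheet document.
extract_hyperlinks <- function(xml) {
  # Main SpreadsheetML namespace, bound here to the prefix "x"
  ns <- c(x = "http://schemas.openxmlformats.org/spreadsheetml/2006/main")
  nodes <- getNodeSet(xml, "//x:hyperlink", namespaces = ns)
  data.frame(
    ref = vapply(nodes, xmlGetAttr, character(1),
                 name = "ref", default = NA_character_),
    display = vapply(nodes, xmlGetAttr, character(1),
                     name = "display", default = NA_character_),
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper: extract one sheet from the .xlsx and parse it.
# Assumes the data lives in sheet1 unless told otherwise.
read_sheet_hyperlinks <- function(xlsx.file,
                                  sheet.path = "xl/worksheets/sheet1.xml") {
  tmp.dir <- tempfile("xlsx_")
  dir.create(tmp.dir)
  on.exit(unlink(tmp.dir, recursive = TRUE))
  # unzip() is pointed at the .xlsx itself here, without a renamed copy
  unzip(xlsx.file, files = sheet.path, exdir = tmp.dir)
  extract_hyperlinks(xmlParse(file.path(tmp.dir, sheet.path)))
}
```

Usage would be something like `links <- read_sheet_hyperlinks(my.excel.file)`. The `display` attribute is optional in the file format, so it comes back as NA for any link that does not carry one.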
(a) great find; (b) not so convoluted. RExcelXML does something similar.